On Pattern Occurrences in a Random Text
Abstract
Consider a given pattern H and a random text T of length n. We assume that symbols in the text occur independently, and various symbols have different probabilities of occurrence (i.e., the so-called asymmetric Bernoulli model). We are concerned with the probability of exactly r occurrences of H in the text T. We derive the generating function of this probability, and show that asymptotically it behaves as $a n^r \rho_H^{n-r-1}$, where $a$ is an explicitly computed constant, and $\rho_H < 1$ is the root of an equation depending on the structure of the pattern. We then extend these findings to random patterns.
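For small n, the quantity discussed in the abstract can be checked numerically. The sketch below is not the generating-function derivation of the paper; it is a plain dynamic program over a KMP-style matching automaton, assuming a non-empty pattern, overlapping occurrences counted separately, and symbol probabilities supplied as a dictionary. The names `failure_table` and `occurrence_distribution` are hypothetical and chosen only for this illustration.

```python
from collections import defaultdict


def failure_table(pattern):
    """KMP failure function: fail[i] is the length of the longest
    proper border (prefix that is also a suffix) of pattern[:i + 1]."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def occurrence_distribution(pattern, probs, n):
    """Return {r: P(exactly r occurrences of pattern)} for a random text
    of length n whose symbols are drawn independently with the given
    probabilities (asymmetric Bernoulli model). Assumes pattern is
    non-empty and counts overlapping occurrences separately."""
    m = len(pattern)
    fail = failure_table(pattern)

    def step(state, symbol):
        # state = number of pattern symbols currently matched (0..m)
        if state == m:
            state = fail[m - 1]  # continue from the longest border
        while state > 0 and pattern[state] != symbol:
            state = fail[state - 1]
        if pattern[state] == symbol:
            state += 1
        return state

    # current[(state, count)] = probability of being in that configuration
    current = {(0, 0): 1.0}
    for _ in range(n):
        nxt = defaultdict(float)
        for (state, count), p in current.items():
            for symbol, q in probs.items():
                s = step(state, symbol)
                nxt[(s, count + (1 if s == m else 0))] += p * q
        current = nxt

    # Marginalize over the automaton state to get the count distribution
    dist = defaultdict(float)
    for (_, count), p in current.items():
        dist[count] += p
    return dict(dist)
```

For example, `occurrence_distribution("aa", {"a": 0.5, "b": 0.5}, 3)` gives probabilities 5/8, 2/8 and 1/8 for r = 0, 1 and 2, which matches a direct enumeration of the eight texts of length 3. The running time grows as n times the pattern length times the alphabet size times the number of reachable counts, so this is only a sanity check for small inputs, not a substitute for the asymptotic formula.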
Similar resources
ON PATTERN OCCURRENCES IN A RANDOM TEXT
Consider a given pattern H and a random text T of length n. We assume that symbols in the text occur independently, and various symbols have different probabilities of occurrence (i.e., the so-called asymmetric Bernoulli model). We are concerned with the probability of exactly r occurrences of H in the text T. We derive the generating function of this probability, and show that asymptotically it...
A New Document Embedding Method for News Classification
Text classification is one of the main tasks of natural language processing (NLP). In this task, documents are classified into pre-defined categories. There is a lot of news spreading on the web. A text classifier can categorize news automatically, and this facilitates and accelerates access to the news. The first step in text classification is to represent documents in a suitable way t...
Worst Case Efficient Single and Multiple String Matching in the RAM Model
In this paper, we explore worst-case solutions for the problems of single and multiple matching on strings in the word RAM model with word length w. In the first problem, we have to build a data structure based on a pattern p of length m over an alphabet of size σ such that we can answer the following query: given a text T of length n, where each character is encoded using log σ bits return ...
Frequency of Pattern Occurrences in a (DNA) Sequence
Consider a given pattern H and a random text T of length n. We assume that consecutive symbols in the text are generated either independently or with a Markovian dependency, i.e., we study both the so-called Bernoulli model and the Markovian model. Our goal is to assess the limiting distribution of the frequency of the pattern occurrences in a random sequence. Overlapping copies of a pattern ...
Efficient String Matching with k Mismatches
Given a text of length n, a pattern of length m and an integer k, we present an algorithm for finding all occurrences of the pattern in the text, each with at most k mismatches. The algorithm runs in $O(k(m \log m + n))$ time. 1. INTRODUCTION The problem of string matching with k mismatches is defined as follows. Suppose we are given a text of length n, a pattern of length m and an integer k. Fi...
Journal title:
- Inf. Process. Lett.
Volume 57, Issue
Pages -
Publication date 1996